WEIGHTED k-NN GRAPHS FOR RÉNYI ENTROPY ESTIMATION IN HIGH DIMENSIONS
Authors
Abstract
Rényi entropy is an information-theoretic measure of randomness that is fundamental to several applications. Several estimators of Rényi entropy based on k-nearest neighbor (k-NN) distances have been proposed in the literature. For d-dimensional densities f, the variance of these Rényi entropy estimators of f decays as O(1/M), where M is the number of samples drawn from f. On the other hand, because of the curse of dimensionality, the bias decays only as O(M^(-1/d)). As a result, the bias dominates the mean square error (MSE) in high dimensions. To address this large bias in high dimensions, we propose a weighted k-NN estimator in which the weights serve to lower the bias to O(M^(-1/2)), which then ensures that the MSE of the weighted estimator converges at the parametric rate of O(1/M). These weights are determined by solving a convex optimization problem. We subsequently use the weighted estimator to perform anomaly detection in wireless sensor networks.
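As a rough illustration of the construction described above, the sketch below combines standard k-NN plug-in estimates of the order-alpha Rényi entropy computed for several values of k, with weights obtained as the minimum-norm solution of linear bias-cancellation constraints (a simple convex program). The base estimator, the particular constraints, and combining at the level of the entropy estimates are assumptions made for illustration only; they are not the authors' exact estimator or optimization problem.

```python
# A minimal sketch of an ensemble-weighted k-NN Renyi entropy estimator.
# The base estimator, the set of k values, and the minimum-norm solution of
# the bias-cancellation constraints are illustrative assumptions, not the
# paper's exact construction.
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import gammaln


def knn_renyi_entropy(X, k, alpha):
    """Plug-in k-NN estimate of the order-alpha Renyi entropy (alpha != 1)."""
    M, d = X.shape
    tree = cKDTree(X)
    # k+1 because the nearest neighbour of a sample point in its own sample is itself
    r_k = tree.query(X, k=k + 1)[0][:, k]
    log_c_d = (d / 2.0) * np.log(np.pi) - gammaln(d / 2.0 + 1.0)  # log unit-ball volume
    # density estimate f(X_i) ~ k / (M * c_d * r_k^d)
    log_f = np.log(k) - np.log(M) - log_c_d - d * np.log(r_k)
    # I_alpha = (1/M) sum_i f(X_i)^(alpha-1),  H_alpha = log(I_alpha) / (1-alpha)
    I_alpha = np.mean(np.exp((alpha - 1.0) * log_f))
    return np.log(I_alpha) / (1.0 - alpha)


def ensemble_weights(ks, d, n_constraints):
    """Minimum-norm weights w with sum(w) = 1 and sum_k w_k * k^(j/d) = 0, j = 1..J.

    Closed-form solution of the convex problem min ||w||^2 s.t. A w = b,
    used here as a stand-in for the paper's convex program.
    """
    ks = np.asarray(ks, dtype=float)
    A = np.vstack([np.ones_like(ks)] +
                  [ks ** (j / d) for j in range(1, n_constraints + 1)])
    b = np.zeros(n_constraints + 1)
    b[0] = 1.0
    return A.T @ np.linalg.solve(A @ A.T, b)


def weighted_knn_renyi_entropy(X, ks, alpha, n_constraints=2):
    """Weighted combination of base k-NN estimates over several values of k."""
    w = ensemble_weights(ks, X.shape[1], n_constraints)
    return sum(wk * knn_renyi_entropy(X, k, alpha) for wk, k in zip(w, ks))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((2000, 6))          # M = 2000 samples, d = 6
    print(weighted_knn_renyi_entropy(X, ks=[8, 16, 32, 64], alpha=2.0))
```

Minimizing the weight norm is one natural way to limit the variance inflation that sign-alternating weights would otherwise cause while the linear constraints cancel the leading bias terms.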
Similar papers
Rényi entropy dimension of the mixture of measures
Rényi entropy dimension describes the rate of growth of coding cost in the process of lossy data compression when there is an exponential dependence between the code length and the cost of coding. In this paper we generalize Csiszár's estimation of the Rényi entropy dimension of a mixture of measures to the case of a general probability metric space. This result determines the cost of encoding...
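For intuition about the quantity discussed above, the following toy script (not taken from the cited paper) estimates the order-alpha Rényi entropy dimension of a simple discrete/continuous mixture as the slope of the Rényi entropy of the quantized sample against log(1/eps). The choice of mixture, the range of resolutions, and the least-squares fit are all illustrative assumptions.

```python
# Illustrative sketch: empirically estimate the order-alpha Renyi entropy
# dimension of a mixture of measures as the slope of the quantized Renyi
# entropy versus log(1/eps). Toy example only, not the cited paper's method.
import numpy as np


def renyi_entropy_discrete(p, alpha):
    """Renyi entropy (in nats) of a discrete distribution p (alpha != 1)."""
    p = p[p > 0]
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)


def quantized_renyi_entropy(x, eps, alpha):
    """Renyi entropy of the sample quantized to cells of width eps."""
    _, counts = np.unique(np.floor(x / eps), return_counts=True)
    return renyi_entropy_discrete(counts / counts.sum(), alpha)


rng = np.random.default_rng(1)
n = 200_000
# Mixture: with probability 1/2 an atom at 0, otherwise uniform on [0, 1].
mask = rng.random(n) < 0.5
x = np.where(mask, 0.0, rng.random(n))

alpha = 0.8
epsilons = 2.0 ** -np.arange(4, 12)
H = [quantized_renyi_entropy(x, e, alpha) for e in epsilons]
# The fitted slope is a crude, finite-resolution estimate of the mixture's
# order-alpha Renyi entropy dimension.
slope = np.polyfit(np.log(1.0 / epsilons), H, 1)[0]
print(f"estimated Renyi entropy dimension (alpha={alpha}): {slope:.3f}")
```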
Improvement of the k-nn Entropy Estimator with Applications in Systems Biology
In this paper, we investigate efficient estimation of differential entropy for multivariate random variables. We propose bias correction for the nearest neighbor estimator, which yields more accurate results in higher dimensions. In order to demonstrate the accuracy of the improvement, we calculated the corrected estimator for several families of random variables. For multivariate distributions...
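For reference, here is a minimal sketch of the standard (uncorrected) Kozachenko-Leonenko k-NN differential entropy estimator that such bias corrections start from; the paper's corrected estimator itself is not reproduced here.

```python
# Sketch of the standard Kozachenko-Leonenko k-NN differential entropy
# estimator, shown for reference only.
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln


def kl_entropy(X, k=4):
    """Kozachenko-Leonenko k-NN estimate of differential entropy (in nats)."""
    M, d = X.shape
    r_k = cKDTree(X).query(X, k=k + 1)[0][:, k]   # distance to k-th neighbour
    log_c_d = (d / 2.0) * np.log(np.pi) - gammaln(d / 2.0 + 1.0)
    return digamma(M) - digamma(k) + log_c_d + d * np.mean(np.log(r_k))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 5
    X = rng.standard_normal((5000, d))
    true_H = 0.5 * d * np.log(2 * np.pi * np.e)   # entropy of N(0, I_d)
    print(kl_entropy(X, k=4), "vs true", true_H)
```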
High-Dimensional Entropy Estimation for Finite Accuracy Data: R-NN Entropy Estimator
We address the problem of entropy estimation for high-dimensional finite-accuracy data. Our main application is evaluating high-order mutual information image similarity criteria for multimodal image registration. The basis of our method is an estimator based on k-th nearest neighbor (NN) distances, modified so that only distances greater than some constant R are evaluated. This modification re...
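The exact form of the R-NN modification is not given in the snippet above, so the sketch below only illustrates the underlying issue: with finite-accuracy (quantized) data, coincident points give zero k-NN distances and break the log term of a k-NN entropy estimator, and flooring the distances at a constant R is one crude workaround. This is an assumption-laden toy, not the estimator proposed in the paper.

```python
# Toy illustration only: quantized data produce zero k-NN distances; flooring
# the distances at a constant R keeps the log term finite. This is NOT the
# paper's R-NN estimator.
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln


def knn_entropy_floored(X, k=4, R=0.5):
    M, d = X.shape
    r_k = cKDTree(X).query(X, k=k + 1)[0][:, k]
    r_k = np.maximum(r_k, R)               # avoid log(0) from coincident points
    log_c_d = (d / 2.0) * np.log(np.pi) - gammaln(d / 2.0 + 1.0)
    return digamma(M) - digamma(k) + log_c_d + d * np.mean(np.log(r_k))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((4000, 3))
    X_quantized = np.round(X)               # finite-accuracy data: many ties
    print(knn_entropy_floored(X_quantized, k=4, R=0.5))
```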
Maximum-Entropy Parameter Estimation for the k-nn Modified Value-Difference Kernel
We introduce an extension of the modified value-difference kernel of k-nn by replacing the kernel’s default class distribution matrix with the matrix produced by the maximum-entropy learning algorithm. This hybrid algorithm is tested on fifteen machine learning benchmark tasks, comparing the hybrid to standard k-nn classification and maximum-entropy-based classification. Results show that the h...
Fast Parallel Estimation of High Dimensional Information Theoretical Quantities with Low Dimensional Random Projection Ensembles
• Goal: estimation of high dimensional information theoretical quantities (entropy, mutual information, divergence).
• Problem: computation/estimation is quite slow.
• Consistent estimation is possible by nearest neighbor (NN) methods [1] → pairwise distances of sample points:
  – expensive in high dimensions [2],
  – approximate isometric embedding into low dimension is possible (Johnson-Lindenstrau...
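A hedged sketch of the computational pattern outlined in the list above: project the sample onto a few low-dimensional Gaussian random subspaces (in the spirit of Johnson-Lindenstrauss embeddings), run a k-NN entropy estimator in the cheap low-dimensional space, and average over the projection ensemble. The projection dimension, ensemble size, and plain averaging are illustrative assumptions; the result is an entropy estimate for the projected data and is meant only to show the pattern, not to reproduce the cited method or its guarantees.

```python
# Sketch of the random-projection pattern: k-NN entropy estimates computed in
# several low-dimensional random projections, then averaged. Projection
# dimension, ensemble size, and averaging are illustrative choices only.
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln


def kl_entropy(X, k=4):
    """Kozachenko-Leonenko k-NN differential entropy estimate (nats)."""
    M, d = X.shape
    r_k = cKDTree(X).query(X, k=k + 1)[0][:, k]
    log_c_d = (d / 2.0) * np.log(np.pi) - gammaln(d / 2.0 + 1.0)
    return digamma(M) - digamma(k) + log_c_d + d * np.mean(np.log(r_k))


def projected_entropy_ensemble(X, proj_dim=8, n_proj=10, k=4, seed=0):
    """Average k-NN entropy estimate over an ensemble of random projections."""
    rng = np.random.default_rng(seed)
    M, d = X.shape
    estimates = []
    for _ in range(n_proj):
        # Gaussian random projection, scaled to roughly preserve distances
        P = rng.standard_normal((d, proj_dim)) / np.sqrt(proj_dim)
        estimates.append(kl_entropy(X @ P, k=k))
    return float(np.mean(estimates))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((3000, 50))     # high-dimensional sample
    print(projected_entropy_ensemble(X, proj_dim=8, n_proj=10))
```

The gain here is purely computational: the NN queries run on proj_dim-dimensional points instead of d-dimensional ones, which is where the cost of pairwise distances concentrates.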